Abstract. We consider communities whose vertices are predominantly connected, i.e., the vertices in each community are more strongly connected to other members of the same community than to vertices outside the community. Flake et al. introduced a hierarchical clustering algorithm that finds such predominantly connected communities of different coarseness depending on an input parameter. We present a simple and efficient method for constructing a clustering hierarchy according to Flake et al. that supersedes the necessity of choosing feasible parameter values and guarantees the completeness of the resulting hierarchy, i.e., the hierarchy contains all clusterings that can be constructed by the original algorithm for any parameter value. However, predominantly connected communities are not organized in a single hierarchy. Thus, we develop a framework that, after precomputing at most $2(n-1)$ maximum flows, admits a linear time construction of a clustering $\Omega(S)$ of predominantly connected communities that contains a given community $S$ and is maximum in the sense that any further clustering of predominantly connected communities that also contains $S$ is hierarchically nested in $\Omega(S)$. We further generalize this construction yielding a clustering with similar properties for $k$ given communities in $O(kn)$ time. This admits the analysis of a network's structure with respect to various communities in different hierarchies.

1 Introduction 

There exist many different approaches to find communities in networks, many of which are inspired by graph clustering techniques originally developed for special applications in fields like physics and biology. Graph clustering is based on the assumption that the given network is a compound of dense subgraphs, so-called clusters or communities, that are only sparsely connected among each other, and aims at finding a clustering that represents these subgraphs. However, evaluating the quality of a found clustering is often difficult, since there are no generally applicable criteria for good clusterings, and clustering properties that are well interpretable in the network's context are rarely guaranteed. In this work we thus focus on predominantly connected communities in undirected edge-weighted graphs. Predominant connectivity is easy to interpret and guarantees that only vertices whose membership to a community is clearly indicated by the network's structure are assigned to a community. The latter is in particular desired if the analysis of the community structure is meant to support costly or risky decisions.

Contribution and Outline. We discuss different types of predominantly connected communities (cp. Table 1 for an overview) in Section 2 and argue that considering source communities (SCs) in networks is reasonable. We further give a characterization of SCs and introduce basic nesting properties. In Section 3, we review the cut clustering algorithm by Flake et al. [2], which takes an input parameter $\alpha$ and decomposes a given network into SCs, each of which provides an intra-cluster density of at least $\alpha$ and an inter-cluster sparsity of at most $\alpha$. At the same time, $\alpha$ controls the coarseness of the resulting clustering such that for varying values the algorithm returns a clustering hierarchy. Flake et al. refer to Gallo et al. [3] for the question of how to choose $\alpha$ such that all possible hierarchy levels are found. However, they give no further description of how to extend the approach of Gallo et al., which finds all breakpoints of $\alpha$ for a single parametric flow, to a fast construction of a complete hierarchy. They just propose a binary-search approach to find good values for $\alpha$. We introduce a parametric-search approach that guarantees the completeness of the resulting hierarchy and improves on the running time of a binary-search-based approach, whose running time strongly depends on the discretization of the parameter range.

Table 1. Overview of different types of predominantly connected communities. The columns to the right describe the relations between the types in terms of inclusion.

Experimental evaluations further showed that the cut clustering algorithm finds meaningful clusters in real-world instances [2]. Yet it often happens that even in a complete hierarchy non-singleton clusters are only found for a subgraph of the initial network, while the remaining vertices stay unclustered even on the coarsest non-trivial hierarchy level [2, 6]. Motivated by this observation, in Section 4 we develop a framework that is based on a set $M(G)$ of $n \le |M(G)| \le 2(n-1)$ maximal SCs in the graph $G$, i.e., each further SC is nested in an SC in $M(G)$. The set $M(G)$ is represented by a special cut tree, which can be constructed by at most $2(n-1)$ max-flow computations. After computing $M(G)$ in a preprocessing step, the framework efficiently answers the following queries: (i) Given an arbitrary SC $S$, what does a clustering $\Omega(S)$ look like that consists of $S$ and further SCs such that any SC not intersecting with $S$ is nested in a cluster of $\Omega(S)$? In particular, $\Omega(S)$ is maximum in the sense that any clustering of SCs that contains $S$ is hierarchically nested in $\Omega(S)$. We show that $\Omega(S)$ can be determined in linear time. (ii) Given $k$ disjoint SCs, which is the maximal clustering $\Omega(S_1, \dots, S_k)$ that contains the given SCs, is nested in each $\Omega(S_i)$, $i = 1, \dots, k$, and guarantees that any clustering of SCs that also contains the given ones is nested in $\Omega(S_1, \dots, S_k)$? Computing $\Omega(S_1, \dots, S_k)$ takes $O(kn)$ time. These queries allow us to further examine the community structure of a given network, beyond the complete clustering hierarchy according to Flake et al. We exemplarily apply both queries to a small real-world network, thereby finding a new clustering beyond the hierarchy that contains all non-singleton clusters of the best clustering in the hierarchy but far fewer singletons.

Preliminaries. Throughout this work we consider an undirected, weighted graph $G = (V, E, c)$ with vertices $V$, edges $E$ and a positive edge cost function $c$, writing $c(u, v)$ as a shorthand for $c(\{u, v\})$ with $\{u, v\} \in E$. Whenever we consider the degree $\deg(v)$ of $v \in V$, we implicitly mean the sum of all edge costs incident to $v$. A cut in $G$ is a partition of $V$ into two cut sides $S$ and $V \setminus S$. The cost $c(S, V \setminus S)$ of a cut is the sum of the costs of all edges crossing the cut, i.e., edges $\{u, v\}$ with $u \in S$, $v \in V \setminus S$. For two disjoint sets $A, B \subseteq V$ we define the cost $c(A, B)$ analogously. Two cuts are non-crossing if their cut sides are pairwise nested or disjoint. Two sets $S, T \subseteq V$ are separated by a cut if they lie on different cut sides. A minimum $S$-$T$-cut is a cut that separates $S$ and $T$ and is the cheapest cut among all cuts separating these sets. We call a cut a minimum separating cut if there exists an arbitrary pair $\{S, T\}$ for which it is a minimum $S$-$T$-cut. We identify singleton sets with the contained vertex without further notice. We further denote the connectivity of $\{S, T\} \subseteq 2^V$ by $\lambda(S, T)$, describing the cost of a minimum $S$-$T$-cut.

A clustering $\Omega$ of $G$ is a partition of $V$ into subsets $C^1, \dots, C^k$, which define vertex-induced subgraphs, called clusters. A cluster is trivial if it corresponds to a connected component. A vertex that forms a singleton cluster although it is no singleton in $G$ is unclustered. A clustering is trivial if it consists of trivial clusters or if $k = n$. A hierarchy of clusterings is a sequence $\Omega_1 \le \dots \le \Omega_r$ such that $\Omega_i \le \Omega_j$ implies that each cluster in $\Omega_i$ is a subset of a cluster in $\Omega_j$. We say $\Omega_i \le \Omega_j$ are hierarchically nested. A clustering $\Omega$ is maximal with respect to a property $\mathcal{P}$ if there is no other clustering $\Omega'$ with property $\mathcal{P}$ and $\Omega \le \Omega'$.
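
The nesting relation on clusterings can be made concrete with a few lines of Python. The following is only an illustrative sketch: the function names `is_nested` and `is_hierarchy` and the representation of a clustering as a list of vertex sets are assumptions made for this example, not notation from the text.

```python
def is_nested(finer, coarser):
    """True if each cluster of `finer` is a subset of some cluster of `coarser`.

    Both arguments are clusterings given as lists of vertex sets
    (a representation assumed for this sketch).
    """
    return all(any(set(c) <= set(d) for d in coarser) for c in finer)


def is_hierarchy(clusterings):
    """True if consecutive clusterings in the sequence are hierarchically nested."""
    return all(is_nested(a, b) for a, b in zip(clusterings, clusterings[1:]))
```

For example, the singletons of $\{0, 1, 2, 3\}$, the pairs $\{0, 1\}, \{2, 3\}$ and the single cluster $\{0, 1, 2, 3\}$ form a hierarchy, whereas the pairs $\{0, 1\}, \{2, 3\}$ are not nested in $\{0, 2\}, \{1, 3\}$.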

2 Predominantly Connected Communities 

In the context of large web-based graphs, Flake et al. [1] introduce web communities (WCs) in terms of predominant connectivity of single vertices: A set $S \subseteq V$ is a web community if $c(\{u\}, S \setminus \{u\}) \ge c(\{u\}, V \setminus S)$ for all $u \in S$. Web communities are not necessarily connected (cp. Fig. 1) and decomposing a graph into $k$ web communities is NP-complete [2]. Extending the predominant connectivity from vertices to arbitrary subsets yields extreme sets (ESs), which satisfy a stricter property that guarantees connectivity and gives a good intuition why the vertices in ESs belong together: A set $S \subseteq V$ is an extreme set if $c(U, S \setminus U) > c(U, V \setminus S)$ for all $U \subsetneq S$. The extreme sets in a graph can be computed in $O(nm + n^2 \log n)$ time with the help of maximum adjacency orderings [9]. They form a subset of the maximal components of a graph, which subsume vertices that are not separated by cuts cheaper than a certain lower bound. Maximal components are either nested or disjoint and can be deduced from a cut tree, whose construction needs $n - 1$ maximum flow computations [4]. They are used in the context of image segmentation by Wu and Leahy [11].

Fig. 1. Unconnected web community (left)

In social networks, for example, we are also interested in communities that surround a designated vertex, for instance a central person. Complying with this view, source communities (SCs) describe vertex sets where each subset that does not contain a designated vertex is predominantly connected to the remainder of the group: A set $S \subseteq V$ is an SC with source $s \in S$ if $c(U, S \setminus U) > c(U, V \setminus S)$ for all $U \subseteq S \setminus \{s\}$. The members of an SC can be interpreted as followers of the source in the sense that each subgroup feels more attracted by the source (and other group members) than by the vertices outside the group. The predominant connectivity of SCs implies a close relation to minimum separating cuts. In fact, SCs are characterized as follows.
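
The definition of an SC can be checked directly on small graphs by enumerating all subsets $U$. The sketch below is a brute-force illustration only (exponential in $|S|$). The names `cost` and `is_source_community`, the edge-dictionary representation, and the restriction to nonempty $U$ are assumptions of this example.

```python
from itertools import combinations


def cost(weights, A, B):
    """c(A, B): total cost of edges between the disjoint vertex sets A and B.

    `weights` maps each undirected edge, stored once as a tuple (u, v), to its cost.
    """
    return sum(w for (u, v), w in weights.items()
               if (u in A and v in B) or (u in B and v in A))


def is_source_community(weights, vertices, S, s):
    """Check c(U, S \\ U) > c(U, V \\ S) for every nonempty U within S \\ {s}."""
    S, V = set(S), set(vertices)
    if s not in S:
        return False
    rest = list(S - {s})
    for size in range(1, len(rest) + 1):
        for U in map(set, combinations(rest, size)):
            if not cost(weights, U, S - U) > cost(weights, U, V - S):
                return False
    return True


# Hypothetical example graph: two unit-weight triangles joined by the edge {2, 3}.
TWO_TRIANGLES = {(0, 1): 1, (0, 2): 1, (1, 2): 1,
                 (3, 4): 1, (3, 5): 1, (4, 5): 1, (2, 3): 1}
```

In this example graph the triangle $\{0, 1, 2\}$ is an SC for each of its vertices as source, while $\{2, 3\}$ is not an SC of $2$, since vertex $3$ has cost $1$ towards $2$ but cost $2$ towards the outside.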
Lemma 1. A set $S \subseteq V$ is an SC of $s \in S$ iff there is $T \subseteq V \setminus S$ such that $(S, V \setminus S)$ is the minimum $s$-$T$-cut in $G$ that minimizes the number of vertices on the side containing $s$.

Proof. ($\Rightarrow$): If $S$ is an SC of $s$, $(S, V \setminus S)$ is a minimum $s$-$T$-cut for $T = V \setminus S$. Otherwise, a cheaper $s$-$T$-cut would split $S$ into $U$ and $S \setminus U \ni s$ with $c(U, S \setminus U) < c(U, V \setminus S)$, which is a contradiction. The cut $(S, V \setminus S)$ further minimizes the number of vertices on the side containing $s$, since otherwise a "smaller" cut would induce a set $U \subseteq S$, $s \notin U$, with $c(U, S \setminus U) = c(U, V \setminus S)$.

($\Leftarrow$): If $(S, V \setminus S)$ is a minimum $s$-$T$-cut with $T \subseteq V \setminus S$, then it is $c(U, S \setminus U) \ge c(U, V \setminus S)$ for all $U \subseteq S \setminus \{s\}$. Otherwise, $(S \setminus U, V \setminus (S \setminus U))$, which also separates $s$ and $T$, would be a cheaper $s$-$T$-cut. If $(S, V \setminus S)$ further minimizes the number of vertices on the side containing $s$, it is $c(U, S \setminus U) > c(U, V \setminus S)$ for all $U \subseteq S \setminus \{s\}$. Otherwise, $(S \setminus U, V \setminus (S \setminus U))$ would be a minimum $s$-$T$-cut with a smaller side containing $s$. □

Based on this characterization, we introduce some further notations and two basic lemmas on nesting properties of SCs, which we will mainly use in Section 4. Note that a minimum $s$-$T$-cut in $G$ need not be unique; however, the minimum $s$-$T$-cut that minimizes the number of vertices on the side containing $s$ is unique. We call such a cut, which induces an SC $S$, a community cut, $S$ the SC of $s$ with respect to $T$, and $T$ the opponent of $s$. Hence, $\mathrm{SC}\colon V \times 2^V \to 2^V$, mapping $(s, T)$ to the SC of $s$ with respect to $T$, is well defined, providing $\mathrm{SC}(s, T)$ as future notation. The corresponding maximum flow between $s$ and $T$ also induces an opposite SC $S' := \mathrm{SC}(T, s)$, if we consider $T$ as a compound node. If the community cut is the only minimum $s$-$T$-cut, it is $S' = V \setminus S$. Otherwise, $X := V \setminus (S \cup S') \neq \emptyset$ and the vertices in $X$ are neither predominantly connected within $S \cup X$ nor within $S' \cup X$, i.e., $c(U, S \cup X) \le c(U, V \setminus (S \cup X))$ for all $U \subseteq X$ (analogously for $S'$). In a social network, for example, this can be interpreted as follows. Whenever $s$ and the group $T$ become rivals, the network decomposes into followers of $s$ (in $S$), followers of $T$ (in $S'$) and possibly some indecisive individuals in $V \setminus (S \cup S')$. Figure 2 exemplarily shows two indecisive vertices in the (unweighted¹) karate club network gathered by Zachary [12]. Note that an SC can have several sources, and a vertex can have different SCs w.r.t. different opponents. The SCs of a vertex are partially nested as stated in Lemma 2, which is a special case of (2i) of Lemma 3 summarizing the intersection behavior of arbitrary SCs. See Figure 3 for illustration and an example of neither nested nor disjoint SCs.

Fig. 2. Indecisive vertices (white).
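
The decomposition into $S$, $S'$ and the indecisive vertices $X$ can be read off a single maximum flow: $S$ is the set of vertices reachable from $s$ in the residual graph, and $S'$ is the set of vertices from which the opponent is still reachable. The following sketch assumes a single opponent vertex `t` (a compound opponent $T$ would have to be contracted first) and uses a plain shortest-augmenting-path max-flow for illustration; the helper names `undirected` and `community_split` are hypothetical.

```python
from collections import deque


def undirected(edges):
    """Build a symmetric capacity map {u: {v: cost}} from (u, v, cost) triples."""
    cap = {}
    for u, v, w in edges:
        cap.setdefault(u, {})[v] = w
        cap.setdefault(v, {})[u] = w
    return cap


def community_split(cap, s, t):
    """Return (S, S_opp, X): SC of s w.r.t. t, opposite SC, indecisive vertices."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}  # residual capacities

    def search(start, forward):
        # BFS along residual arcs, either leaving `start` or leading into it.
        parent = {start: None}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in res[u]:
                r = res[u][v] if forward else res[v][u]
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = search(s, True)
        if t not in parent:
            break  # no augmenting path left: the flow is maximum
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(res[a][b] for a, b in path)
        for a, b in path:
            res[a][b] -= delta
            res[b][a] += delta

    S = set(parent)                 # smallest minimum-cut side containing s
    S_opp = set(search(t, False))   # smallest minimum-cut side containing t
    return S, S_opp, set(cap) - S - S_opp
```

On the unit-weight path a, b, c with source a and opponent c, both $\{a\}$ and $\{a, b\}$ induce minimum cuts, so vertex b is indecisive.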

Lemma 2. Let $S$ denote an SC of $s$ and $T \cap S = \emptyset$. Then $S \subseteq \mathrm{SC}(s, T)$.

Proof. Since Case (2i) of Lemma 3 admits $s_1 = s_2$, this lemma directly follows with $s = s_1$, $S = S_1$, $T = T_2$ and $S' := \mathrm{SC}(s, T) = S_2$. □

As a consequence, each SC $S \subsetneq V$ is nested in an SC $S'$ that is an SC w.r.t. a single vertex $t$, while any SC $\hat{S}$ with $S' \subsetneq \hat{S}$ contains $t$. In this sense, SCs w.r.t. single vertices are maximal. We denote the set of maximal SCs in $G$ by $M(G)$.



¹ Zachary considers the weighted network and therein the minimum cut that separates the two central vertices of highest degree (black). In the weighted network this cut is unique.
Lemma 3. Consider $S_1 := \mathrm{SC}(s_1, T_1)$ and $S_2 := \mathrm{SC}(s_2, T_2)$.

(1) If $\{s_1, s_2\} \cap (S_1 \cap S_2) = \emptyset$, then $S_1 \cap S_2 = \emptyset$.

(2) If $T_2 \cap S_1 = \emptyset$ and $s_1 \in S_2$, then $S_1 \subseteq S_2$ (i). If further $T_1 \cap S_2 = \emptyset$ and $s_2 \in S_1$, then $S_1 = S_2$ (ii).

(3) Otherwise, $S_1$ and $S_2$ are neither nested nor disjoint.

Proof. Proof of (1):

First recall that $S_1 \cap T_1 = S_2 \cap T_2 = \emptyset$. Now suppose $S_1 \cap S_2 \neq \emptyset$ (cp. Figure 3(a)). Since $U := (S_1 \cap S_2) \subseteq S_1$ with $s_1 \notin U$ it is $\deg(U)|_{S_1} > \deg(U)|_{(V \setminus S_1) \cup U}$. This is equivalent to $c(U, V \setminus S_1) < c(U, S_1 \setminus U)$ and it follows, since $(S_2 \setminus U) \subseteq (V \setminus S_1)$,

$$c(U, S_2 \setminus U) \le c(U, V \setminus S_1) < c(U, S_1 \setminus U). \tag{1}$$

We apply inequality (1) in order to show that the cut $(S_2 \setminus U, V \setminus (S_2 \setminus U))$, which also separates $s_2$ and $T_2$, is cheaper than the community cut inducing $S_2$, which leads to a contradiction.

We represent the costs of the two cuts as follows:

$$c(S_2, V \setminus S_2) = c(S_2 \setminus U, S_1 \setminus U) + c(U, S_1 \setminus U) + c(S_2, V \setminus (S_1 \cup S_2)) \tag{2}$$

$$c(S_2 \setminus U, V \setminus (S_2 \setminus U)) = c(S_2 \setminus U, S_1 \setminus U) + c(S_2 \setminus U, U) + c(S_2 \setminus U, V \setminus (S_1 \cup S_2)) \tag{3}$$

Since $(S_2 \setminus U) \subseteq S_2$ it is $c(S_2 \setminus U, V \setminus (S_1 \cup S_2)) \le c(S_2, V \setminus (S_1 \cup S_2))$ and with (1) we see that (3) < (2).
Proof of (2i):

Suppose $U := S_1 \setminus S_2 \neq \emptyset$ (cp. Fig. 3(b)). Then it is $c(U, S_2) \le c(U, V \setminus (S_1 \cup S_2))$, since otherwise $(S_1 \cup S_2, V \setminus (S_1 \cup S_2))$ would be a cheaper $s_2$-$T_2$-cut than the community cut inducing $S_2$. Since $(S_1 \setminus U) \cup (S_2 \setminus S_1) = S_2$, it is

$$c(U, S_1 \setminus U) + c(U, S_2 \setminus S_1) \le c(U, V \setminus (S_1 \cup S_2)). \tag{4}$$

We apply inequality (4) in order to show that the cut $(S_1 \setminus U, V \setminus (S_1 \setminus U))$, which also separates $s_1$ and $T_1$, is at most as expensive as the community cut inducing $S_1$, which leads to a contradiction, since $|S_1 \setminus U| < |S_1|$.

We represent the costs of the two cuts as follows:

$$c(S_1, V \setminus S_1) = c(S_1 \setminus U, S_2 \setminus S_1) + c(U, S_2 \setminus S_1) + c(S_1 \setminus U, V \setminus (S_1 \cup S_2)) + c(U, V \setminus (S_1 \cup S_2)) \tag{5}$$

$$c(S_1 \setminus U, V \setminus (S_1 \setminus U)) = c(S_1 \setminus U, S_2 \setminus S_1) + c(S_1 \setminus U, U) + c(S_1 \setminus U, V \setminus (S_1 \cup S_2)) \tag{6}$$

If we add $c(U, S_2 \setminus S_1)$ to (6) and apply (4), we get a result that is at most as expensive as (5). Hence, (6) $\le$ (5). But $S_1 \setminus U$ is smaller than $S_1$, contradicting the fact that $S_1$ is a source community.
Fig. 3. Situation in Lemma 3: (a) Case (1), (b) Case (2i), (c) Case (2ii), (d) Case (3).

Proof of (2ii):

Since the general case applies, it is $S_1 \subseteq S_2$ (cp. Fig. 3(c)). Furthermore, with $S_1$ also separating $s_2$ and $T_2$ and $S_2$ also separating $s_1$ and $T_1$ we get $\lambda(s_1, T_1) = \lambda(s_2, T_2)$, and thus, $(S_1, V \setminus S_1)$ is also a minimum $s_2$-$T_2$-cut with $|S_1| \le |S_2|$. Hence, it must be $S_1 = S_2$, otherwise $S_2$ would not be the source community of $s_2$ with respect to $T_2$.

Proof of (3):

In the remaining cases it is $S_1 \cap S_2 \neq \emptyset$. Hence, $S_1$ and $S_2$ are not disjoint. Furthermore, it is either $T_1 \cap S_2 \neq \emptyset$ and $T_2 \cap S_1 \neq \emptyset$, or $T_1 \cap S_2 \neq \emptyset$ and $s_1 \in S_1 \setminus S_2$. Thus, $S_1$ and $S_2$ are not nested. Figure 3(d) shows an example where $S_1$ and $S_2$ exist and are neither nested nor disjoint. □



3 Complete Hierarchical Cut Clustering 

The clustering algorithm of Flake et al. [2] exploits the properties of minimum separating cuts together with a parameter $\alpha$ in order to get clusterings where the clusters are SCs with the following additional property: For each cluster $C \in \Omega$ and each $U \subsetneq C$ it holds

$$\frac{c(C, V \setminus C)}{|V \setminus C|} \le \alpha \le \frac{c(U, C \setminus U)}{\min\{|U|, |C \setminus U|\}} \tag{7}$$

According to the left side of this inequality, separating a cluster $C$ from the rest of the graph costs at most $\alpha\,|V \setminus C|$, which guarantees a certain inter-cluster sparsity. The right side further guarantees a good intra-cluster density in terms of expansion, a measure introduced by [7], saying that splitting a cluster $C$ into $U$ and $C \setminus U$ costs at least $\alpha \min\{|U|, |C \setminus U|\}$. Hence, the vertex sets representing valid candidates for clusters must be very tight: in addition to the predominant connectivity they must also provide an expansion that exceeds a given bound.
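
Both sides of condition (7) can be evaluated by brute force for a small cluster. The sketch below is illustrative only: it assumes integer edge costs, a cluster $C \neq V$ with at least two vertices, and the same hypothetical edge-dictionary representation and example graph as used elsewhere in these sketches.

```python
from fractions import Fraction
from itertools import combinations


def cost(weights, A, B):
    """c(A, B): total cost of edges between the disjoint vertex sets A and B."""
    return sum(w for (u, v), w in weights.items()
               if (u in A and v in B) or (u in B and v in A))


def sparsity_and_expansion(weights, vertices, C):
    """Return the left and the tightest right side of condition (7) for cluster C."""
    V, C = set(vertices), set(C)
    sparsity = Fraction(cost(weights, C, V - C), len(V - C))
    expansion = min(
        Fraction(cost(weights, set(U), C - set(U)), min(len(U), len(C) - len(U)))
        for size in range(1, len(C))          # nonempty proper subsets U of C
        for U in combinations(list(C), size)
    )
    return sparsity, expansion


# Hypothetical example graph: two unit-weight triangles joined by the edge {2, 3}.
TWO_TRIANGLES = {(0, 1): 1, (0, 2): 1, (1, 2): 1,
                 (3, 4): 1, (3, 5): 1, (4, 5): 1, (2, 3): 1}
```

For the triangle $\{0, 1, 2\}$ in this example the left side of (7) is $1/3$ and the smallest right side is $2$, so the triangle can only appear as a cluster for parameter values in that range.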



Flake et al. develop their parametric cut clustering algorithm step by step starting from an idea involving cut trees [4]. The final approach, however, just uses community cuts in a modified graph in order to identify clusters that satisfy condition (7). We refer to this approach by CutC. Here we give a more direct description of this method. Given a graph $G = (V, E, c)$ and a parameter $\alpha > 0$, as a preprocessing step, augment $G$ by inserting an artificial vertex $t$ and connecting $t$ to each vertex in $G$ by an edge of cost $\alpha$. Denote the resulting graph by $G_\alpha = (V_\alpha, E_\alpha, c_\alpha)$. Then apply CutC (Alg. 1) by iterating $V$ and computing $\mathrm{SC}(u, t)$ for each vertex $u$ not yet contained in a previously computed community. The source $u$ becomes the representative of the newly computed SC (line 4). Since SCs with respect to a common vertex $t$ are either disjoint or nested (Lemma 3(1), (2i)), we finally get a set $\Omega$ of SCs in $G_\alpha$, which together decompose $V$. Since the vertices in $G_\alpha$ are additionally connected to $t$, each SC in $G_\alpha$ with respect to $t$ is also an SC in $G$. However, it is not necessarily a maximal SC in $M(G)$.

Algorithm 1: CutC
Input: Graph $G_\alpha = (V_\alpha, E_\alpha, c_\alpha)$
1  $\Omega \leftarrow \emptyset$
2  while $\exists\, u \in V_\alpha \setminus \{t\}$ do
3      $C^u \leftarrow \mathrm{SC}(u, t)$ in $G_\alpha$
4      $r(C^u) \leftarrow u$
5      forall the $C^i \in \Omega$ do
6          if $r(C^i) \in C^u$ then $\Omega \leftarrow \Omega \setminus \{C^i\}$
7      $\Omega \leftarrow \Omega \cup \{C^u\}$; $V_\alpha \leftarrow V_\alpha \setminus C^u$
8  return $\Omega$
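
The procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used by Flake et al.: the community cut is obtained from a simple shortest-augmenting-path max-flow (any max-flow algorithm would do), the graph is a symmetric dictionary of capacities, and the names `undirected`, `source_community` and `cut_clustering` are hypothetical. Exact arithmetic (integers or `Fraction`) is assumed so that ties between cuts are resolved correctly.

```python
from collections import deque
from fractions import Fraction


def undirected(edges):
    """Build a symmetric capacity map {u: {v: cost}} from (u, v, cost) triples."""
    cap = {}
    for u, v, w in edges:
        cap.setdefault(u, {})[v] = w
        cap.setdefault(v, {})[u] = w
    return cap


def source_community(cap, s, t):
    """SC(s, t): the smallest side containing s among all minimum s-t-cuts."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}  # residual capacities
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in res[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return set(parent)  # vertices reachable from s in the residual graph
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(res[a][b] for a, b in path)
        for a, b in path:
            res[a][b] -= delta
            res[b][a] += delta


def cut_clustering(cap, alpha):
    """CutC: cluster the graph `cap` for the parameter value `alpha`."""
    t = object()  # artificial vertex, connected to every vertex with cost alpha
    aug = {u: dict(nbrs) for u, nbrs in cap.items()}
    aug[t] = {}
    for u in cap:
        aug[u][t] = alpha
        aug[t][u] = alpha
    clusters = {}          # representative r(C) -> community C
    remaining = set(cap)   # vertices not yet contained in a community
    while remaining:
        u = next(iter(remaining))
        community = source_community(aug, u, t)
        # Drop earlier communities whose representative lies in the new one.
        clusters = {r: c for r, c in clusters.items() if r not in community}
        clusters[u] = community
        remaining -= community
    return list(clusters.values())


# Hypothetical example: two unit-weight triangles joined by the edge {2, 3}.
TWO_TRIANGLES = undirected([(0, 1, 1), (0, 2, 1), (1, 2, 1),
                            (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 1)])
EXAMPLE = cut_clustering(TWO_TRIANGLES, Fraction(2, 5))
```

With $\alpha = 2/5$ the example graph decomposes into its two triangles; a smaller value such as $1/4$ yields a single cluster, and $\alpha = 1$ (the maximum edge cost) yields singletons.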

Applying CutC iteratively with decreasing $\alpha$ yields a hierarchy of at most $n$ different clusterings (cp. Figure 4). This is due to a special nesting property for different parameter values. Let $C_1$ denote the SC of $u$ in $G_{\alpha_1}$ and $C_2$ the SC of $u$ in $G_{\alpha_2}$. Then it is $C_1 \subseteq C_2$ if $\alpha_1 > \alpha_2$. The hierarchy is bounded by two trivial clusterings, which we already know in advance. The clustering at the top consists of the connected components of $G$ and is returned by CutC for $\alpha_{\max} = 0$; the clustering at the bottom consists of singletons and comes up if we choose $\alpha_0$ equal to the maximum edge cost in $G$.



Simple Parametric Search Approach. The crucial point with the construction of such a hierarchy, however, is the choice of $\alpha$. If we choose the next value too close to a previous one, we get a clustering we already know, which implies unnecessary effort. If we choose the next value too far from any previous one, we possibly miss a clustering. Flake et al. propose a binary search for the choice of $\alpha$. However, this necessitates a discretization of the parameter range, an issue where again limiting the risk of missing interesting values by small steps is opposed to improving the running time by wide steps. In practice the choice of a good coarseness of the discretization requires previous knowledge on the graph structure, which we usually do not have. Thus, we introduce a simple parametric search approach for constructing a complete² hierarchy that does not require any previous knowledge.

For two consecutive hierarchy levels $\Omega_i < \Omega_{i+1}$ we call $\alpha'$ the breakpoint if CutC returns $\Omega_i$ for $\alpha'$ and $\Omega_{i+1}$ for $\alpha' - \varepsilon$ with $\varepsilon \to 0$. The simple idea of our approach is to compute good candidates for breakpoints during a recursive search with the help of cut-cost functions of the clusters, such that each candidate that is no breakpoint yields a new clustering instead. In this way, we apply CutC at most twice per level in the final hierarchy. Beginning with the trivial clusterings $\Omega_0 < \Omega_{\max}$ ($\alpha_0 > \alpha_{\max}$), the following theorem directly implies an efficient algorithm.

Fig. 4. Clustering hierarchy by CutC. Note, $\alpha_{\max} < \alpha_0$ whereas $\Omega_{\max} > \Omega_0$.

Theorem 1. Let $\Omega_i < \Omega_j$ denote two different clusterings with parameter values $\alpha_i > \alpha_j$. In time $O(|\Omega_i|)$ a parameter value $\alpha_m$ with 1) $\alpha_j < \alpha_m \le \alpha_i$ can be computed such that 2) $\Omega_i \le \Omega_m \le \Omega_j$, and 3) $\Omega_m = \Omega_i$ implies that $\alpha_m$ is the breakpoint between $\Omega_i$ and $\Omega_j$.

² The completeness refers to all clusterings that can be obtained by CutC for a value $\alpha$.
Sketch of proof. We use cut-cost functions that represent, depending on $\alpha$, the cost $\omega_S(\alpha)$ of a cut $(S, V_\alpha \setminus S)$ in $G_\alpha$ based on the cost of the cut $(S, V \setminus S)$ in $G$ and the size of $S$.

$$\omega_S\colon \mathbb{R}_0^+ \to [c(S, V \setminus S), \infty) \subset \mathbb{R}_0^+, \qquad \omega_S(\alpha) := c(S, V \setminus S) + |S|\,\alpha$$

The main idea is the following. Let $\Omega_i < \Omega_j$ denote two hierarchically nested clusterings. We call a cluster $C' \in \Omega_i$ that is nested in $C \in \Omega_j$ a child of $C$ and $C$ the parent of $C'$. If there exists another level $\Omega'$ between $\Omega_i$ and $\Omega_j$, at least two clusters in $\Omega_i$ must be merged yielding a larger cluster in $\Omega'$. The maximal parameter value where this happens is a value $\alpha^*$ where a child $C'$ in $\Omega_i$ becomes more expensive than its parent $C$ in $\Omega_j$, and thus, is dominated by $C$ in the sense that it will not become a cluster in any hierarchy level above $\alpha^*$ (i.e., where $\alpha < \alpha^*$). For two nested clusters $C' \subset C$ this point is marked by the intersection point of the cut-cost functions $\omega_{C'}$ and $\omega_C$ (Figure 5). Thus, this intersection point is a good candidate for a breakpoint between $\Omega_i$ and $\Omega'$. We choose $\alpha_m := \min_{C \in \Omega_j} \lambda_C$ with $\lambda_C := \max_{C' \in \Omega_i : C' \subset C} \{\alpha \mid \omega_{C'}(\alpha) = \omega_C(\alpha)\}$ and prove that Claims 1) to 3) as stated in Theorem 1 hold with this choice of $\alpha_m$. The proofs are rather technical, thus we postpone them to Appendix A.
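
Solving $\omega_{C'}(\alpha) = \omega_C(\alpha)$ for a child $C'$ with proper parent $C$ gives the intersection point $\alpha = \bigl(c(C', V \setminus C') - c(C, V \setminus C)\bigr) / \bigl(|C| - |C'|\bigr)$. The following sketch computes $\alpha_m$ from stored cluster sizes and cut costs. The function name, the dictionary layout, and the rule of skipping children that coincide with their parent (for which no intersection point exists) are assumptions of this example; integer costs are assumed so that `Fraction` yields exact values.

```python
from fractions import Fraction


def candidate_breakpoint(children, parents, parent_of):
    """Compute alpha_m = min over parents of the largest child intersection point.

    children, parents: {representative: (size, cut cost in G)}
    parent_of:         {child representative: parent representative}
    """
    lam = {}  # parent representative -> lambda_C
    for child, (size_c, cost_c) in children.items():
        parent = parent_of[child]
        size_p, cost_p = parents[parent]
        if size_p == size_c:
            continue  # child equals its parent: identical cut-cost functions
        # Intersection of  cost_c + size_c * alpha  and  cost_p + size_p * alpha.
        alpha = Fraction(cost_c - cost_p, size_p - size_c)
        lam[parent] = max(lam.get(parent, alpha), alpha)
    return min(lam.values())
```

As a hypothetical example, take two unit-weight triangles joined by one edge. Between the singleton clustering (cut costs 2, 2, 3, 3, 2, 2) and the single cluster of all six vertices (cut cost 0), the intersection points are $2/5$ and $3/5$, so the candidate is $\alpha_m = 3/5$.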

For the running time, observe that $\alpha_m$ is well-defined as each parent function intersects with at least one child function. In practice we construct $\alpha_m$ by iterating the list of representatives stored for $\Omega_i$. These representatives are assigned to a cluster in $\Omega_j$, thus, matching children to their parents can be done in time $O(|\Omega_i|)$. The computation of the intersection points takes only constant time, given that the sizes and costs of the clusters are stored with the representatives by CutC. In total, the time for computing $\alpha_m$ is thus in $O(|\Omega_i|)$. □



Running time. The parametric search approach calls CutC twice per level in the final hierarchy, once when computing a level the first time and again right before detecting that the level already exists and a breakpoint is reached. The trivial levels $\Omega_{\max}$ and $\Omega_0$ are calculated in advance without using CutC. Nevertheless, $\Omega_0$ is recalculated once when the breakpoint to the lowest non-trivial level is found. This yields $2(h - 2) + 1$ applications of CutC, with $h$ the number of levels. We denote the running time of CutC by $T(n)$ without further analysis. For a more detailed discussion on the running time of CutC see [2]. Since common min-cut algorithms run in $O(n^2 \sqrt{m})$ time, a single min-cut computation already dominates the costs for determining $\alpha_m$ and further linear overhead. The running time of our simple parametric approach thus is in $O(2h\,T(n))$, where $h \le n - 1$. This improves the running time of a binary search, which is in $O(h \log(d)\, T(n))$, with $d$ the number of discretization steps, in particular since we may assume $d \gg n$ in order to minimize the risk of missing levels. We also tested the practicability of our simple approach by a brief experiment. The results confirm the improved theoretical running time. We provide them in Appendix A as a bonus.

Fig. 5. Intersecting cut-cost functions.
4 Framework for Analyzing SC Structures

In general, clusterings in which all clusters are SCs are only partially hierarchically ordered. Hence, hierarchical algorithms like the cut clustering algorithm of Flake et al. [2] provide only a limited view on the whole SC structure of a network. In this section we develop a framework for efficiently analyzing different hierarchies in the SC structure after precomputing at most $2(n - 1)$ maximum flows. The basis of our framework is the set $M(G)$ of maximal SCs in $G$. This can be represented by a cut tree of special community cuts, together with some additionally stored SCs, as we will show in the following.

A (general) cut tree is a weighted tree $\mathcal{T}(G) = (V, E_{\mathcal{T}}, c_{\mathcal{T}})$ on the vertices of an undirected, weighted graph $G = (V, E, c)$ (with edges not necessarily in $G$) such that each $\{s, t\} \in E_{\mathcal{T}}$ induces a minimum $s$-$t$-cut in $G$ (by decomposing $\mathcal{T}(G)$ into two connected components) and such that $c_{\mathcal{T}}(\{s, t\})$ is equal to the cost of the induced cut. The cut tree algorithm, which was first introduced by Gomory and Hu [4] in their pioneering work on cut trees and later simplified by Gusfield [5], applies $n - 1$ cut computations. For a detailed description of this algorithm see [4, 5] or Appendix B.
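
A well-known consequence of this definition is that the connectivity of any two vertices equals the cost of a cheapest edge on their path in the cut tree. The sketch below answers such a query on a given tree. It assumes the tree has already been constructed and is passed as a dictionary of weighted edges; the function name and the example tree are hypothetical.

```python
def min_cut_value(tree_edges, s, t):
    """Cost of a minimum s-t-cut, read off a cut tree {(u, v): cost}."""
    adj = {}
    for (u, v), w in tree_edges.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    # Depth-first search from s, tracking the cheapest edge seen on the path.
    stack = [(s, None, float("inf"))]
    while stack:
        node, prev, cheapest = stack.pop()
        if node == t:
            return cheapest
        for nxt, w in adj.get(node, []):
            if nxt != prev:
                stack.append((nxt, node, min(cheapest, w)))
    raise ValueError("s and t are not connected in the tree")


# Hypothetical cut tree on four vertices.
EXAMPLE_TREE = {("a", "b"): 4, ("b", "c"): 2, ("b", "d"): 3}
```

In the example tree the connectivity of a and c is 2 (the edge between b and c is the cheapest on the path), and the connectivity of a and d is 3.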

The main idea of the cut tree algorithm is to iteratively choose vertices $s$ and $t$ that are not yet separated by a previous cut, and to separate them by a minimum $s$-$t$-cut, which is represented by a new tree edge $\{s, t\}$. Depending on the shape of the found cut it might be necessary to reconnect previous edges in the intermediate tree. Gomory and Hu showed that a reconnected edge also represents a minimum $s'$-$t'$-cut for the new vertices $s'$ and $t'$ incident to the edge after the reconnection. Furthermore, the constructed cuts need to be non-crossing in order to be representable by a tree. While Gomory and Hu prevent crossings with the help of contractions, Gusfield shows that a crossing of an arbitrary minimum $s$-$t$-cut with another minimum separating cut can be easily resolved, if the latter does not separate $s$ and $t$. Hence, the cut tree algorithm basically admits the use of arbitrary minimum cuts.

For our special cut tree we choose the following community cuts: for a vertex pair $\{s, t\}$ let $(S, V \setminus S)$ denote the community cut inducing $S := \mathrm{SC}(s, t)$ and let $(T, V \setminus T)$ denote the community cut inducing $T := \mathrm{SC}(t, s)$. If $|S| \le |T|$, we choose $(S, V \setminus S)$, and $(T, V \setminus T)$ otherwise. Furthermore, we direct the corresponding tree edge to the chosen SC, and we also associate the opposite SC, which was not chosen, with the edge, storing it elsewhere for further use. In Appendix B we show that the so chosen "smallest" community cuts are already non-crossing, hence a transformation according to Gusfield is not necessary. This guarantees that the cuts represented in the final tree are the same community cuts as chosen for the construction. We further show that after reconnecting an edge, the corresponding cut still induces a "smallest" SC for the vertex the edge points to. Altogether, this proves the following.

Theorem 2. For an undirected, weighted graph $G = (V, E, c)$ there exists a rooted cut tree $\mathcal{T}(G) = (V, E_{\mathcal{T}}, c_{\mathcal{T}})$ with edges directed to the leaves such that each edge $(t, s) \in E_{\mathcal{T}}$ represents $\mathrm{SC}(s, t)$, and $|\mathrm{SC}(s, t)| \le |\mathrm{SC}(t, s)|$. Such a tree can be constructed by $n - 1$ maximum flow³ computations.

³ Max-flows are necessary in order to determine a smallest SC. For general cut trees preflows (after the first phase of common max-flow push-relabel algorithms) suffice.
At the price of $O(n^2)$ additional space, the opposite SCs resulting from the cut tree construction can be naively stored in an $(n - 1) \times n$ matrix, which allows checking the membership of a vertex in an opposite SC in constant time. In many cases we even need only $k < (n - 1)$ rows in the matrix, since some edges share the same SC, and we can deduce these edges during the cut tree construction. However, for a few edges the determined opposite SC might become invalid again, due to a special situation while reconnecting the edge. For these edges we need to recalculate the opposite SCs in a second step. Hence, the construction of $\mathcal{T}(G)$ together with the opposite SCs associated with the edges in $\mathcal{T}(G)$ can be done by at most $2(n - 1)$ max-flow computations. We now show that each SC in $M(G)$ is either given by an edge or is an opposite SC associated with an edge in $\mathcal{T}(G)$.

Theorem 3. For an undirected weighted graph $G = (V, E, c)$ it is $n \le |M(G)| \le 2(n - 1)$. Constructing $M(G)$ needs at most $2(n - 1)$ max-flow computations.

Proof. The SC-tree already represents $n - 1$ different maximal source communities of $G$ and there is at least one maximal source community of the root that is not represented by the tree. Hence, there are at least $n$ maximal source communities in $G$.

In order to prove the upper bound we observe the following. From the structure of the SC-tree it follows that if $p$ is a predecessor of $q$, the source community $Q(q, p)$ is given by the cheapest edge on the path between $p$ and $q$ that is closest to $q$. We further show that (i) the source community $P(p, q)$ is the opposite source community associated with the cheapest edge on the path from $p$ to $q$ that is closest to $p$. If $u$ and $v$ are vertices in disjoint subtrees with $r$ the nearest common predecessor, we prove that the source community $U(u, v)$ (ii) equals the source community $U'(u, r)$ if no edge on the path from $r$ to $v$ is cheaper than the cheapest edge on the path from $r$ to $u$, and (iii) equals the source community $R(r, v)$ otherwise. Since $r$ is a predecessor of $u$ and $v$, together with (i) this finally proves that there are at most $2(n - 1)$ different maximal source communities in $G$.

Proof of (i): Let $(t, s) \in E_{\mathcal{T}}$ denote the cheapest edge on $\pi(p, q)$ that is closest to $p$. Obviously it is $\lambda(p, q) = c_{\mathcal{T}}(t, s)$. Since $(t, s)$ is closest to $p$, the community cut inducing $P(p, q)$ does not separate $p$ and $t$. Furthermore, it is $p \in T(t, s)$, otherwise the community cut inducing $T(t, s)$ can be bent according to Lemma 9 such that it induces an edge of cost $\lambda(p, q)$ on $\pi(p, q)$ that is closer to $p$ than $(t, s)$. Hence, we have $\{p, t\} \subseteq P(p, q) \cap T(t, s)$, while $q \notin T(t, s)$.

If $s \notin P(p, q)$ we get the situation of Lemma 3(2ii), which yields $P(p, q) = T(t, s)$. If $s \in P(p, q)$ we get $T(t, s) \subseteq P(p, q)$, according to Lemma 3(2i). However, since $T(t, s)$ also separates $q$ and $p$, it must hold $|P(p, q)| = |T(t, s)|$, which contradicts the assumption $s \in P(p, q)$.

Proof of (ii): If no edge on ir(r, v) is cheaper than the cheapest edge on 7r(r, u), any 
cheapest edge on n(r, u) also induces a minimum u-v-cut, in particular the community 
cut of U'(u, r) is a minimum u-v-cut. We show now that the community cut inducing 
U (u, v) is also a minimum u-r, i.e., that it separates u and r. It follows that U(u, v) = 
U'{u,r). 

Suppose r ∈ U(u, v). Since v ∉ U'(u, r) we get the situation of Lemma 3(2i)
which yields U'(u, r) ⊆ U(u, v). However, since U'(u, r) also separates u and v it
must hold |U(u, v)| = |U'(u, r)|, which contradicts the assumption r ∈ U(u, v).
Proof of (iii): If all edges on the path from r to u are more expensive than the
cheapest edge on the path from r to v, the community cut inducing U(u, v) does not
separate u and r, i.e., it is also a minimum r-v-cut. Hence, it follows that U(u, v) =
R(r, v), since vice versa u ∈ R(r, v) due to λ(r, v) = λ(u, v). □
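The case distinction above relies on one primitive: finding the cheapest edge on a tree path that is closest to a given endpoint. A minimal sketch follows, assuming the cut tree is stored as a parent map with edge costs keyed by unordered vertex pairs; this representation and the function names are illustrative and not taken from the paper.

```python
def path_to_root(parent, v):
    """Vertices from v up to the root (parent[root] is None)."""
    path = [v]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def tree_path(parent, p, q):
    """Vertex sequence of the unique tree path from p to q."""
    up_p = path_to_root(parent, p)
    up_q = path_to_root(parent, q)
    on_q = set(up_q)
    # nearest common predecessor of p and q
    top = next(v for v in up_p if v in on_q)
    head = up_p[:up_p.index(top) + 1]
    tail = up_q[:up_q.index(top)]
    return head + tail[::-1]


def cheapest_edge_closest_to(parent, cost, p, q):
    """Return (cost, a, b) of the cheapest edge on the path p..q nearest to p."""
    path = tree_path(parent, p, q)
    best = None
    for a, b in zip(path, path[1:]):
        c = cost[frozenset((a, b))]
        # strict comparison keeps the first (closest to p) edge on ties
        if best is None or c < best[0]:
            best = (c, a, b)
    return best


# Hypothetical toy tree: 0 is the root, 1 and 2 are its children, 3 hangs below 1.
parent = {0: None, 1: 0, 2: 0, 3: 1}
cost = {frozenset((1, 0)): 2, frozenset((2, 0)): 2, frozenset((3, 1)): 5}
```

Swapping the roles of p and q selects the other extreme edge of the same cost, which is the distinction between the two source communities of a pair discussed above.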

After precomputing M(G), which includes the construction of T(G) (we denote
this by M(G) ⊃ T(G)), the following tools allow us to efficiently analyze the SC structure
of G with respect to different SCs that are already known, for example, from the cut
clustering algorithm of Flake et al. or the set M(G). The key is Lemma 4. It limits the
shape of arbitrary SCs to subtrees in T(G), which admits an efficient enumeration of
disjoint SCs by a depth-first search (DFS), as we will see in the following.

Lemma 4. The subgraph T[T] induced by a SC T in T(G) is connected. 

Proof. If T is represented by an edge in T(G) the assertion obviously holds. Hence,
assume T is an opposite SC or another arbitrary SC. In order to prove the connectivity
of T[T], we first focus on the predecessors of t. Let p denote a predecessor of t with
p ∈ T[T] and q a successor of p on π(p, t). We prove that q ∈ T[T]. Assume q ∉ T[T].
Since t is a successor of q, t is in the SC Q(t, q). According to Lemma 3(2i), however,
it follows that T ⊆ Q, which contradicts p ∈ T.

In a second step we consider the remaining vertices. Let u be a vertex that is no
predecessor of t. Let r denote the nearest common predecessor of u and t. We first
show that (i) if u ∈ T[T], then r ∈ T[T]. Then we suppose there is also a predecessor
p ≠ r of u on π(u, r) and prove (ii) that if u ∈ T[T], then p ∈ T[T]. Together with the
observation on the predecessors of t, this ensures the connectivity of T[T].

Proof of (i): If r = t, we are done. Assume r ≠ t and r ∉ T[T]. Since t is a
successor of r, t is in the SC Q(t, r), while u ∉ Q(t, r). According to Lemma 3(2i),
however, it follows that T ⊆ Q, which contradicts u ∈ T.

Proof of (ii): From (i) we already know that r ∈ T[T]. Assume p ∉ T[T] and
consider the SC P(p, r). It is t ∉ P, and hence, according to Lemma 3(1), P and T are
disjoint, contradicting u ∈ T, since u ∈ P.

If T is maximal and u ∈ T[T] is a successor of t or a successor of a predecessor p
of t, u ∉ π(p, t), we can further show that the subtree rooted in u is in T[T].

This obviously holds if T is represented by a tree edge. If T is a maximal opposite
SC, let (t, s) ∈ E_T denote the edge T is associated with. Let u ∉ S(s, t) denote a
successor of t or a successor of a predecessor p of t with u ∉ π(p, t). With u ∈ T[T] =
T(t, s) and s ∉ U(u, p), with U(u, p) corresponding to the subtree rooted in u, we get
the situation in Lemma 3(2), and it follows U(u, p) ⊆ T(t, s) = T[T]. □

Maximal SC Clustering for one SC. Given an arbitrary SC S, the first tool returns
a clustering Ω(S) of G that contains S, consists of SCs and is maximum in the sense
that each clustering that also consists of S and further SCs is hierarchically nested in
Ω(S). This implies that Ω(S) is the unique maximal clustering among all clusterings
consisting of S and further SCs. We call Ω(S) the maximal SC clustering for S.

Theorem 4. Let S denote a SC in G. The unique maximal SC clustering for S can be
determined in O(n) time after preprocessing M(G) ⊃ T(G).
The maximal SC clustering for S =: S_0 can be determined by the following construction,
which directly implies a simple algorithm. Let r denote the root of T(G) =: T_0
and T[S_0] the subtree induced by S_0 in T_0 (Lemma 4). Deleting T[S_0] decomposes
T_0 into connected components, each of which represents a SC, apart from the one
containing r if r ∉ S_0. If r ∈ S_0, we are done. Otherwise, let T_1 denote the component
containing r and r_0 the root of T[S_0]. Obviously it is p_0 ∈ T_1 for (p_0, r_0) ∈ E_T, and
SC(p_0, r_0) =: S_1 induces a subtree T[S_1] in T_1. Thus, S_1 and T_1 adopt the roles of S_0
and T_0.

Continuing in this way, we finally end up with a SC S_k containing r, such that
deleting T[S_k] yields only SCs. The resulting clustering Ω(S) consists of S = S_0, S_i,
i = 1, . . . , k, and the remaining SCs resulting from the decompositions of T_0, . . . , T_k.
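The construction can be sketched as follows. This is a simplified illustration, not the linear-time implementation: it assumes the cut tree is given as a parent map and that the precomputed opposite SCs are available through a lookup `opposite_sc(p, r)` returning SC(p, r) as a vertex set. The lookup, the names and the toy data are hypothetical.

```python
def component(start, adj, allowed):
    """Vertices reachable from start using only vertices in `allowed`."""
    comp, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w in allowed and w not in comp:
                comp.add(w)
                stack.append(w)
    return comp


def maximal_sc_clustering(parent, S, opposite_sc):
    """Clusters of Omega(S): S_0, ..., S_k plus the split-off subtrees."""
    root = next(v for v, p in parent.items() if p is None)
    adj = {v: [] for v in parent}
    for v, p in parent.items():
        if p is not None:
            adj[v].append(p)
            adj[p].append(v)
    tree = set(parent)          # vertex set of the current tree T_i
    current = frozenset(S)      # current SC S_i
    clusters = []
    while True:
        clusters.append(current)
        rest = tree - current
        root_comp = component(root, adj, rest) if root in rest else set()
        todo = rest - root_comp
        while todo:             # every component without the root is a cluster
            comp = component(next(iter(todo)), adj, todo)
            clusters.append(frozenset(comp))
            todo -= comp
        if not root_comp:       # the root was in S_i: done
            return clusters
        # r_i is the vertex of S_i whose parent p_i lies outside S_i
        top = next(v for v in current if parent[v] not in current)
        current = frozenset(opposite_sc(parent[top], top))
        tree = root_comp


# Hypothetical toy data: root 0 with children 1, 2; vertex 1 has children 3, 4.
toy_parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1}
toy_opposite = {(0, 1): {0, 2}}   # assumed precomputed opposite SC
toy_result = maximal_sc_clustering(
    toy_parent, {1}, lambda p, r: toy_opposite[(p, r)])
```

The sketch recomputes components in every round, so it does not achieve the O(n) bound of Theorem 4; it only mirrors the order in which the clusters arise.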

The proof of the maximality of Ω(S) is based on the following lemma.

Lemma 5. Each SC in Ω(S) \ {S} is a SC with respect to the source of S.

Proof. Let c denote the source of a SC C ∈ Ω(S) and s the source of S. Recall that C
is a maximal SC due to the construction of Ω(S). Then, C' := SC(c, s) and S are
disjoint according to Lemma 3(1), since {c, s} ∩ (C' ∩ S) = ∅.

If C is a SC with respect to a vertex v ∈ S we get C = C' according to
Lemma 3(2ii).

If C is a SC with respect to a vertex v ∉ S, let S' denote the cluster containing v.
With the same arguments as before, C is a SC with respect to the source s' of S'. By
induction and due to the construction, S' is a SC with respect to s and s' is on the path
between c and s in T(G). If C' contained s', then the edge in T(G) indicating C would
also indicate the SC of s' with respect to s, which is S'. This contradicts the fact that
c ∉ S'. Hence, C' does not contain s' and again by Lemma 3(2ii) it is C = C'. □

Let Q denote an arbitrary SC with source q that does not intersect S, let s denote the
source of S, and let C denote the SC in Ω(S) \ {S} with q ∈ C. Since C is a SC with
respect to s ∉ Q (Lemma 5) and q ∈ Q ∩ C, it is Q ⊆ C, according to Lemma 3(2i).
Thus, each SC not intersecting S is nested in a cluster in Ω(S).

For the running time we assume that S is given in a structure that allows us to check
the membership of a vertex in time O(1). Then identifying all clusters in Ω(S) (which
are subtrees) by applying a DFS⁴ starting from the first vertex found in each cluster can
be done in O(n) time, since checking if a visited vertex is still in S_i takes constant time
for i = 1, . . . , k (recall that we store the opposite SCs in a matrix). The remaining
subtrees share their leaves with T(G).

Overlay Clustering for k disjoint SCs. Given k disjoint arbitrary SCs S_1, . . . , S_k,
the second tool returns a clustering Ω(S_1, . . . , S_k) of G that contains S_1, . . . , S_k, is
nested in each maximal SC clustering Ω(S_1), . . . , Ω(S_k) and is maximum in the sense
that each clustering that consists of SCs and also contains S_1, . . . , S_k is hierarchically
nested in Ω(S_1, . . . , S_k). Basically, according to the construction described below,
Ω(S_1, . . . , S_k) is the unique maximal clustering among all clusterings that are nested
in the maximal SC clusterings Ω(S_1), . . . , Ω(S_k). The further properties result from
the maximality of the SC clusterings, as for each Ω(S_j) and each arbitrary SC S that

4 This induces a rooted subtree independent from the orientation in T(G). 
does not intersect S_1, . . . , S_k (or equals a given SC) there exists a cluster C ∈ Ω(S_j)
with S ⊆ C. Note that the clusters in Ω(S_1, . . . , S_k) \ {S_1, . . . , S_k} are not necessarily
SCs. We call Ω(S_1, . . . , S_k) the overlay clustering for S_1, . . . , S_k.

Theorem 5. Let S_1, . . . , S_k denote disjoint SCs in G. The unique overlay clustering
for S_1, . . . , S_k can be determined in O(kn) time after preprocessing M(G) ⊃ T(G).

The overlay clustering for S_1, . . . , S_k can be determined by the following inductive
construction, which directly implies a simple algorithm. We first compute the maximal
SC clustering Ω(S_1) and color the vertices in each cluster, using different colors
for different clusters. Now consider the overlay clustering Ω(S_1, . . . , S_i) for the
first i maximal SC clusterings and color the vertices in S_{i+1}, which is nested in a
cluster of Ω(S_1, . . . , S_i), with a new color. During the computation of Ω(S_{i+1}), we
then construct the intersections of each newly found cluster C with the clusters in
Ω(S_1, . . . , S_i). To this end we exploit that the intersection of two subtrees in a tree
is again a subtree. Hence, the clusters in Ω(S_1, . . . , S_i, S_{i+1}) will be subtrees in T(G),
since the clusters in Ω(S_1), . . . , Ω(S_i) and Ω(S_{i+1}) are subtrees in T(G) by Lemma 4.

Let r' denote the first vertex found in C during the computation of Ω(S_{i+1}). We
mark r' as root of a new cluster in Ω(S_1, . . . , S_i, S_{i+1}) and choose a new color x for r',
besides the color it already has in Ω(S_1, . . . , S_i). When constructing C (by applying
a DFS), we assign the current color x to all vertices visited by the DFS as long as
the underlying color in Ω(S_1, . . . , S_i) does not change. Whenever the DFS visits a
vertex r'' (still in C) with a new underlying color, we choose a new color y for r'' and
mark r'' as root of a subtree of a new cluster in Ω(S_1, . . . , S_i, S_{i+1}). When the DFS
passes r'' on the way back to the parent⁵ p of r'', the color of p in Ω(S_1, . . . , S_i, S_{i+1})
becomes the current color again. Continuing in this way yields a coloring that indicates
the intersections of C with Ω(S_1, . . . , S_i). Repeating this procedure for all clusters in
Ω(S_{i+1}) finally yields Ω(S_1, . . . , S_{i+1}). The running time is in O(kn), since we just
apply k computations of maximal SC clusterings.
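The effect of the coloring can be illustrated with a simplified sketch. Instead of recoloring during the DFS, it groups vertices by the tuple of cluster indices they receive in the k clusterings; this is a substitute technique chosen for brevity. It yields the pairwise intersections, which is sufficient here because intersections of subtrees are again subtrees. The function name and the toy input are assumptions.

```python
def overlay_clustering(clusterings):
    """Common refinement of clusterings that all cover the same vertex set."""
    colors = []
    for clustering in clusterings:
        color = {}
        for index, cluster in enumerate(clustering):
            for v in cluster:
                color[v] = index      # color of v in this clustering
        colors.append(color)
    groups = {}
    for v in colors[0]:
        # vertices with identical color tuples lie in the same intersection
        key = tuple(color[v] for color in colors)
        groups.setdefault(key, set()).add(v)
    return [frozenset(group) for group in groups.values()]


# Hypothetical toy input: two clusterings of the vertex set {0, ..., 4}.
first = [{0, 1}, {2, 3, 4}]
second = [{0, 1, 2}, {3, 4}]
overlay = overlay_clustering([first, second])
```

Each vertex is touched once per clustering, so the grouping takes O(kn) time for k clusterings, in line with Theorem 5.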

Example. We extract two of the many faces of the SC structure of the weighted
co-appearance network (called "lesmis") of the characters in the novel Les Miserables [8].
Figure 6(a) shows the cut tree T("lesmis"), the root r is depicted as filled square.
Figure 6(b) shows the maximal SC clustering Ω(R_1) for the SC R_1 (filled vertices
in squared box). The subtree T[R_1] induced by R_1 in T("lesmis") is indicated by
filled vertices in Figure 6(a). Since r ∈ R_1, deleting T[R_1] immediately decomposes
T("lesmis") into the unframed singleton SCs and the round framed SCs shown in
Figure 6(b). The SC R_1 is the larger of the only two non-singleton clusters in the best cut
clustering (with respect to modularity [10]) found by the cut clustering algorithm of
Flake et al. On the other hand, R_1 is the smallest reasonable SC that was found by the
cut clustering algorithm containing r. The next smaller SC in the hierarchy that contains
r consists of only three vertices. The second non-singleton cluster besides R_1 in
the best cut clustering is also in Ω(R_1), namely A. Nevertheless, Ω(R_1) is not nested
in any clustering of the hierarchy. That is, we found a new clustering that contains all
non-singleton clusters of the best cut clustering but far fewer unclustered vertices. Due to

5 The predecessor adjacent to r" in the rooted subtree induced by the DFS. 
Fig. 6. Exemplary clusterings of the lesmis-network; A, B, C appear in both clusterings.
(a) Basic cut tree T(G). (b) Maximal SC clustering. (c) Overlay clustering.

the maximality of Ω(R_1), there is also no clustering with fewer singletons that consists
of SCs and contains R_1.

Figure 6(c) shows the overlay clustering Ω(S_1, . . . , S_8, R_2) with S_1, . . . , S_8
defined by the non-singleton subtrees of r in T("lesmis"). The SC R_2 (filled vertices in
squared box) has been computed additionally. It equals SC(r, T) with T := S_1 ∪ . . . ∪ S_8.
If we consider the filled vertices in Figure 6(c) as one cluster F := V \ T, then
S_1, . . . , S_8 together with F represent the overlay clustering Ω(S_1, . . . , S_8). However,
Ω(S_1, . . . , S_8) does not only consist of SCs since F is no SC: Observe that for the
two vertices v_1, v_2 ∈ F \ R_2 there exists a vertex u ∈ T (unfilled square) such that
SC(v_i, u) ⊂ F (i = 1, 2) is a singleton. Hence, according to Lemma 3(2i), any SC
in F, apart from {v_1} and {v_2}, must be in R_2. That is, in contrast to Ω(S_1, . . . , S_8),
the overlay clustering Ω(S_1, . . . , S_8, R_2) consists of SCs and any clustering that also
consists of SCs and contains S_1, . . . , S_8 is nested in Ω(S_1, . . . , S_8, R_2).

5 Conclusion 

Based on minimum separating cuts and maximum flows, respectively, we characterized
SCs, a special type of predominantly connected communities. We introduced a method
for efficiently computing a complete hierarchy of clusterings consisting of SCs according
to Flake et al. [2]. Furthermore, we exploited the structure of cut trees [4] in order to
develop a framework that admits the efficient construction of maximal SC clusterings
and overlay clusterings for given SCs, after precomputing at most 2(n − 1) maximum
flows. In most cases, however, we expect only around n − 1 maximum flows for the
preprocessing, since the cases that cause the additional flow computations (when the
opposite SC becomes invalid during the construction of the cut tree) are rare in practice.
For the "lesmis" network in the previous example we needed only n + 3 maximum
flows with n = 77. We remark that a single maximal SC clustering for S can also be
constructed directly by iteratively computing maximal SCs of the vertices not in S with
respect to the source of S. However, in the worst case, this needs |V \ S| flow computations,
if the SCs are singletons or if they are considered in an order that causes many
unnecessary computations of nested SCs. In contrast, due to its short query times, our
framework efficiently supports the detailed analysis of a network's SC structure with
respect to many different maximal SC clusterings and overlay clusterings.
Appendix



A Proof of Theorem 1 and Experimental Evaluation 

Our simple approach for constructing a complete hierarchy of cut clusterings exploits
the properties of cut-cost functions. The cut-cost function ω_S of a set S ⊆ V is a linear
function in α that represents the costs of cut (S, V_α \ S) in G_α based on the costs of cut
(S, V \ S) in G and the size of S.

ω_S : ℝ₀⁺ → [c(S, V \ S), ∞) ⊂ ℝ₀⁺
ω_S(α) := c(S, V \ S) + |S| α

In the context of clusters in a cut clustering hierarchy we call a cluster C' a child of a
cluster C if C' ⊆ C. The cluster C is a parent of C'. Hence, a cluster might have several
children and several parents. Let C' denote a child of a cluster C, i.e., |C'| < |C|. Then,
the slope of ω_C', which is given by |C'|, is positive but less than the slope of ω_C (cp.
Figure 7). If ω_C' and ω_C intersect, let α* denote the intersection point. Note that the
functions of a parent and a child do not intersect in general. For each α ∈ [0, α*) it is
then ω_C(α) < ω_C'(α), and we say that the parent C dominates the child C'. That is,
C' will never become a cluster in G_α, as C induces a smaller u-t-cut for each u ∈ C',
which prevents C' from becoming a community of any vertex.

If the child C' contains the representative r(C) of the parent C, we further observe
that the child C' dominates the parent C with respect to r(C) for each α ∈ [α*, ∞).
That is, C will never become a cluster with representative r(C) in G_α, as
ω_C'(α) < ω_C(α) for each α ∈ (α*, ∞) and |C'| < |C|. Thus, C' either induces a smaller
r(C)-t-cut or, if the cut costs are equal, C' induces a smaller cut side, which both
prevents C from becoming a community of r(C).

Fig. 7. Intersecting cut-cost functions.

The latter observation implies that the function of a cluster C with representative
r(C) always intersects with the function of any child C' of C with r(C) ∈ C'. Otherwise,
C' would dominate C wrt. r(C) in the whole parameter range, contradicting the
fact that C is a cluster with representative r(C) in the hierarchy.
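The domination argument can be checked numerically. The following is a minimal sketch, assuming a cluster is summarized by the pair (cut cost in G, size) with integer or Fraction costs; the function names and the toy numbers are illustrative and not taken from the paper.

```python
from fractions import Fraction


def cut_cost(cluster, alpha):
    """omega_S(alpha) = c(S, V \\ S) + |S| * alpha for cluster = (cost, size)."""
    cost, size = cluster
    return cost + size * alpha


def intersection_point(parent, child):
    """alpha* with omega_parent(alpha*) = omega_child(alpha*), or None.

    Solving c_p + n_p * a = c_c + n_c * a gives a = (c_c - c_p) / (n_p - n_c).
    A negative solution means the lines do not meet in the range [0, inf).
    """
    (c_p, n_p), (c_c, n_c) = parent, child
    if n_p <= n_c:
        raise ValueError("a child must be strictly smaller than its parent")
    alpha = Fraction(c_c - c_p, n_p - n_c)
    return alpha if alpha >= 0 else None


# Hypothetical toy clusters: parent with cost 2 and size 4, child with cost 5 and size 1.
toy_parent_cluster = (2, 4)
toy_child_cluster = (5, 1)
alpha_star = intersection_point(toy_parent_cluster, toy_child_cluster)
```

Below the intersection point the parent has the cheaper cut, above it the child does, which matches the two domination ranges described above.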

For two consecutive hierarchy levels Ω_i < Ω_{i+1} we call α' the breakpoint if CutC
returns Ω_i for α' and Ω_{i+1} for α' − ε with ε → 0. A breakpoint between two hierarchy
levels is in particular an intersection point of the cut-cost functions of two clusters
C' ⊂ C. The simple idea of our parametric search approach is to compute relevant
intersection points and check if they yield new clusterings.

Theorem 1. Let Ω_i < Ω_j denote two different clusterings with parameter values
α_i > α_j. In time O(|Ω_i|) a parameter value α_m with 1) α_j < α_m ≤ α_i can be
computed such that 2) Ω_i ≤ Ω_m < Ω_j, and 3) Ω_m = Ω_i implies that α_m is the
breakpoint between Ω_i and Ω_j.




Proof. This proof constructively describes the steps of our parametric search approach
and shows the correctness. The first step is the construction of α_m. Formally, we define
α_m := min_{C ∈ Ω_j} λ_C with λ_C := max_{C' ∈ Ω_i : C' ⊂ C} {α | ω_C(α) = ω_C'(α)}. The
notation λ_C describes the maximum intersection point of the function ω_C, C ∈ Ω_j, with the
functions of all children of C on level Ω_i. The minimum of these points then yields α_m.
Note that α_m is well-defined as each parent function intersects with at least one child
function. In practice we construct α_m by iterating the list of representatives stored for
Ω_i. Since Ω_j assigns each of these representatives to a cluster, matching the children
to their parents can be done in time O(|Ω_i|). This already dominates the costs for the
remaining steps, which are the computation of the intersection points and the search
for the maximum and minimum values. Recall that the clusters in both clusterings are
also mapped to their sizes and costs, which allows us to compute an intersection point in
constant time.
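The computation of α_m described in this paragraph can be sketched as follows. The data layout is an assumption made for illustration: the finer level is a map from representatives to (cost, size), the coarser level is a map from cluster identifiers to (cost, size), and `parent_of` assigns each representative to the coarser cluster containing it.

```python
from fractions import Fraction


def alpha_m(fine, coarse, parent_of):
    """min over parents C of lambda_C, where lambda_C is the max intersection
    point of omega_C with the cut-cost functions of its proper children."""
    lam = {}
    for rep, (c_child, n_child) in fine.items():
        par = parent_of[rep]
        c_par, n_par = coarse[par]
        if n_child == n_par:
            continue              # cluster unchanged between the levels
        # intersection of c_par + n_par * a and c_child + n_child * a
        a = Fraction(c_child - c_par, n_par - n_child)
        if par not in lam or a > lam[par]:
            lam[par] = a
    return min(lam.values())


# Hypothetical toy levels with made-up costs and sizes.
toy_coarse = {"A": (1, 4), "B": (2, 3)}
toy_fine = {"a1": (3, 2), "a2": (5, 2), "b1": (4, 1), "b2": (3, 2)}
toy_parent_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
toy_alpha_m = alpha_m(toy_fine, toy_coarse, toy_parent_of)
```

In the toy data the maxima are λ_A = 2 and λ_B = 1, so the sketch returns 1. A single pass over the representatives suffices, matching the O(|Ω_i|) bound stated in the theorem.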

In the following let C ∈ Ω_j denote a parent that induces α_m, i.e., λ_C = α_m.
Furthermore, let C¹ ∈ Ω_i denote a child of C that contains the representative r(C)
and let C² ∈ Ω_i denote a child of C with ω_C²(α_m) = ω_C(α_m). Thus, the intersection
point for C² and C is α_m. We denote the intersection point for C¹ and C, which also
exists, by α¹.

Claim 1: α_j < α_m ≤ α_i. Suppose first α_j ≥ α_m. This implies α_j ∈ [α¹, ∞), and
thus, C¹ would dominate C wrt. r(C) in G_{α_j}, contradicting the fact that C is a cluster in
Ω_j with representative r(C). Suppose secondly α_i < α_m. This implies α_i ∈ [0, α_m),
and thus, C would dominate C² in G_{α_i}, contradicting the fact that C² is a cluster in Ω_i.

After having computed α_m we apply CutC with this newly obtained value. The
resulting clustering is denoted by Ω_m. According to Claim 1 and the hierarchical structure
it is Ω_i ≤ Ω_m ≤ Ω_j.

Claim 2: Ω_m ≠ Ω_j. Recall that α¹ ≤ α_m. This implies α_m ∈ [α¹, ∞), and thus,
C¹ dominates C wrt. r(C) in G_{α_m}. Nevertheless, C might be a cluster in Ω_m wrt.
another representative u ≠ r(C). However, this can be disproven by the same argument,
since the intersection point for C and the child containing u is, analogously to α¹, also
at most α_m. Recall that α_m is the maximum intersection point regarding the children
of C. Thus, Ω_j contains at least one cluster C ∉ Ω_m.

Claim 3: If Ω_m = Ω_i then α_m is the breakpoint between Ω_i and Ω_j. We first
show that α_m is the breakpoint between Ω_i and the next higher level in the complete
hierarchy. In a second step we prove that the clustering on the next higher level equals
Ω_j. To see the first assertion consider α_m − ε < α_m for ε → 0. This implies α_m − ε ∈
[0, α_m), and thus, C dominates C² in G_{α_m − ε}. Consequently, Ω_m = Ω_i contains a
cluster C² that will never appear for α_m − ε, and thus, α_m is the breakpoint between Ω_i
and the next higher level in the complete hierarchy. In order to prove the latter assertion
saying that the next higher level equals Ω_j, we show that each cluster in Ω_j corresponds
to a community in G_{α_m − ε}. The nesting property for communities together with the
hierarchical structure then ensures that CutC returns Ω_j for α_m − ε, which means that
there exists no further clustering between Ω_i and Ω_j. For this final step we overload
the notation of C and C² as follows: Let C ∈ Ω_j denote an arbitrary cluster and let
C² ∈ Ω_i = Ω_m denote a child of C with ω_C²(λ_C) = ω_C(λ_C). That is, the intersection
point for C and C² is λ_C ≥ α_m. Let further r denote the representative of C² in
Ω_m = Ω_i. Recall that Ω_m = Ω_i does not imply the equivalence of the representative
of C² in Ω_i and the representative of C² in Ω_m. We show that (a) λ_C = α_m, and based
on this, that (b) C equals the community of r in G_{α_m − ε}.

Sub-claim (a): λ_C = α_m. Suppose λ_C > α_m, which implies α_m ∈ [0, λ_C). Then
the parent C would dominate the child C² in G_{α_m}, contradicting the fact that C² is a
cluster in Ω_m.

Sub-claim (b): C equals the community of r in G_{α_m − ε}. Let C' denote the community
of r in G_{α_m − ε}. Then the hierarchical structure implies C² ⊆ C' ⊆ C. It is further
ω_C²(α_m) ≤ ω_C'(α_m), as otherwise C' would induce a smaller r-t-cut in G_{α_m} than the
actual community C² of r. On the other hand, it is ω_C'(α_m − ε) ≤ ω_C(α_m − ε), by
the same argument, i.e., otherwise C would induce a smaller r-t-cut in G_{α_m − ε} than the
actual community C' of r. With the help of (a) we see that ω_C²(α_m) = ω_C(α_m), and
thus, the cost function of C' must lie above ω_C in α_m and below ω_C in α_m − ε (cp.
Figure 8). This implies that the slope of ω_C' is at least the slope of ω_C. Now suppose
C' ≠ C, which implies C' ⊂ C and thus |C'| < |C|. The latter, however, means that the
slope of ω_C' is less than the slope of ω_C, contradicting the previous observation. Hence,
it is C' = C.

□ 


Theorem 1 allows a recursive search beginning with the computation of α_m for the
trivial clusterings Ω_0 < Ω_max (α_0 > α_max). Recall that in Ω_0 each vertex is a
cluster; Ω_max consists of the set of connected components. After applying CutC for α_m
the resulting clustering can be easily compared to the current lower level by counting
clusters. If a new clustering was found, the recursion branches and the list storing the
levels of the hierarchy is updated. Otherwise, the current branch stops since the
breakpoint between two consecutive clusterings has been found. In contrast to a binary
search on the discretized parameter range this approach definitely returns a complete
hierarchy.
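The recursion can be sketched as follows, assuming two callables: `cutc(alpha)` returning the clustering for a parameter value and `next_alpha(fine, coarse)` returning the value α_m of Theorem 1. Both are stand-ins. The toy versions below are driven by a hard-coded list of levels and only exercise the control flow; they do not compute any cuts.

```python
def complete_hierarchy(cutc, next_alpha, fine, coarse):
    """Levels strictly between `fine` and `coarse`, finest first."""
    alpha = next_alpha(fine, coarse)
    mid = cutc(alpha)
    if len(mid) == len(fine):
        # same number of clusters as the lower level: breakpoint found
        return []
    return (complete_hierarchy(cutc, next_alpha, fine, mid)
            + [mid]
            + complete_hierarchy(cutc, next_alpha, mid, coarse))


# Hypothetical toy hierarchy: (smallest alpha yielding the level, clustering).
LEVELS = [
    (3, ({0}, {1}, {2}, {3})),
    (2, ({0, 1}, {2}, {3})),
    (1, ({0, 1}, {2, 3})),
    (0, ({0, 1, 2, 3},)),
]


def toy_cutc(alpha):
    """Stand-in for CutC: finest level whose threshold is at most alpha."""
    for threshold, clustering in LEVELS:
        if alpha >= threshold:
            return clustering
    return LEVELS[-1][1]


def toy_next_alpha(fine, coarse):
    """Stand-in for alpha_m: breakpoint directly below the coarse level."""
    index = [clustering for _, clustering in LEVELS].index(coarse)
    return LEVELS[index - 1][0]


between = complete_hierarchy(toy_cutc, toy_next_alpha, LEVELS[0][1], LEVELS[-1][1])
```

Comparing cluster counts is enough to detect that no new level was found, as noted above, because the returned clustering is always nested between the two given levels.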




Fig. 8. Intersecting cut-cost functions.



Experimental Evaluation. For our experiments we used real world instances as well as 
generated instances. Most instances are taken from the testbed of the 10th DIMACS Im- 
plementation Challenge [1], which provides benchmark instances for partitioning and 
clustering. The implementation was realized within the LEMON framework [2], ver- 
sion 1.2.1. We implemented CutC as described in Algorithm 1, extended by a heuristic 
that chooses the vertices in non-increasing order w.r.t. the weighted degree. Due to this 
heuristic, which was proposed by Flake et al., the number of min-cut computations in
CutC becomes proportional to the number of clusters in the resulting clustering [3]. The
min-cut implementation provided by LEMON runs in O(n² √m). Note that we did not
focus on a notably fast implementation. Instead, the implementation should be simple
and practical using available routines for sophisticated parts like the min-cut computation.
Table 2 lists ascending CPU times determined on an AMD Opteron Processor 252
with 2.6 GHz and 16 GB RAM.

For comparison, we further ran a binary search on the same instances, using the
same CutC implementation in the same framework. The running times are listed twice
in Table 2, once as CPU times and again as factors saying how much longer the binary
search ran compared to the parametric search. However, this is not meant to be a
competitive running time experiment, since the running time of the binary search mainly
depends on the discretization. We just want to demonstrate that being compelled to choose
the discretization intuitively, without any knowledge of the final hierarchy, makes the
binary search less practical. From a user's point of view focusing on completeness, we
defined the size of the discretization steps as 1/n². The dependency on n is motivated
by the fact that the potential number of levels increases with n, and by the hope that the
breakpoints are distributed more or less equidistantly. For small graphs with n < 1000,
one can even afford some more running time. Thus, we reduced the step size for those
graphs to 1/(1000 n) in further support of completeness. This yields 2^10 to 2^30
discretization steps depending on the length of the parameter range [α_max, α_0]. With this
discretization the binary search exceeds the parametric search by a factor of four up
to 32.

Furthermore, as expected, the running time does not only depend on the input size 
but also on the number of different levels in the hierarchy. This can be observed for both 
approaches comparing the instances as-22july06 and cond-mat. Although the latter is 
smaller, it takes longer to compute 80 levels compared to only 33 levels in the former 
graph. 

Table 2. Running times for the parametric search approach (ParS) and the binary search approach
(BinS) in minutes and seconds. The factors listed for BinS describe how much longer BinS ran
compared to ParS. Instances are sorted by CPU times for ParS. Times longer than six days are
marked by *.



graph                n        m        h    ParS [m:s]    BinS [m:s]    BinS [fac]
celegans_metabolic   453      2025     8    0.300         7.620         8.380
celegansneural       297      2148     17   0.406         8.653         9.919
netscience           1589     2742     38   4.310         4.030         11.952
power                4941     6594     66   1:25.736      8.773         15.742
as-22july06          22963    48436    33   39:54.495     12.419        20.583
cond-mat             16726    47594    80   44:15.317     14.917        27.425
rgg_n_2_15           32768    160240   46   245:25.644    32.748        22.573
G_n_pin_pout         100000   501198   4    369:29.033    *             *
cond-mat-2005        40421    175691   82   652:32.163    *             21.446
B Proof of Theorem 2 and the Cut Tree Algorithm



In this section we show that applying the cut tree algorithm of Gomory and Hu with 
smallest community cuts as described in Section 4 yields a cut tree as stated in Theo- 
rem 2. 

Theorem 2. For an undirected, weighted graph G = (V, E, c) there exists a rooted cut
tree T(G) = (V, E_T, c_T) with edges directed to the leaves such that each edge (t, s) ∈
E_T represents SC(s, t), and |SC(s, t)| ≤ |SC(t, s)|. Such a tree can be constructed by
n − 1 maximum flow⁶ computations.

To this end, we briefly review the cut tree algorithm of Gomory and Hu [4] and prove
Lemma 6 and Lemma 7, which together guarantee the correctness of our construction
and show how the opposite SCs can also be retained. Recall that for our special cut tree
we choose the following community cuts in line 6, Algorithm 2: for a vertex pair {s, t}
let (S, V \ S) denote the community cut inducing SC(s, t) and let (T, V \ T) denote
the community cut inducing SC(t, s). If |SC(s, t)| ≤ |SC(t, s)| we choose (S, V \ S),
and (T, V \ T) otherwise. We call the chosen SC a "smallest" SC with respect to s and
t and orient the resulting edge in the intermediate tree such that it points to the SC.
Furthermore, we choose in line 3, Algorithm 2, the last considered vertex in S together
with a new vertex in S as {u, v}.

Lemma 6. Let S denote a smallest SC with respect to s ∈ S and t ∈ V \ S and S' a
smallest SC with respect to s and another vertex x ∈ S. Then S' ⊆ S. If further s ∈ S',
then S is a smallest SC with respect to x and t and SC(t, s) = SC(t, x).




Fig. 9. Situation in Lemma 6. (a) x ∈ S'. (b) s ∈ S'.



Lemma 7. Let U denote a smallest SC with respect to u ∈ U and s ∈ V \ U and let S'
denote a smallest SC with respect to s and another vertex x ∈ V \ U. Then U ⊆ S' or
U ∩ S' = ∅.

If s ∈ S' and U ∩ S' = ∅, then U is also a smallest SC with respect to u and x
and if (i) x ∈ SC(s, u), SC(s, u) = SC(x, s) and otherwise (ii) SC(s, u) = S' and
SC(x, u) = SC(x, s) if also u ∉ SC(x, s).

6 Max-flows are necessary in order to determine a smallest SC. For general cut trees preflows 
(after the first phase of common max-flow-push-relabel algorithms) suffice. 
If x ∈ S' and U ⊆ S', then U is also a smallest SC with respect to u and x and if
(i) x ∈ SC(s, u), SC(s, u) = SC(x, u) and otherwise (ii) SC(s, u) = SC(s, x), but no
assertion on SC(x, u), which is the missing opposite SC of the edge (x, u).



Fig. 10. Situation in Lemma 7. (a) s ∈ S', U ⊆ S'. (b) s ∈ S', U ∩ S' = ∅.
(c) x ∈ S', U ∩ S' = ∅. (d) x ∈ S', U ⊆ S'.

Reviewing the Cut Tree Algorithm. We briefly revisit the construction of a cut tree [4, 
5]. This algorithm iteratively constructs n — 1 non-crossing minimum separating cuts 
for n — 1 vertex pairs, which we call step pairs. These pairs are chosen arbitrarily from 
the set of pairs not separated by any of the cuts constructed so far. Algorithm 2 briefly 
describes the cut tree algorithm of Gomory and Hu. 



Algorithm 2: Cut Tree

Input: Graph G = (V, E, c)
Output: Cut tree of G

 1 Initialize tree T* := (V*, E*, c*) with V* ← {V}, E* ← ∅ and c* empty
 2 while ∃ S ∈ V* with |S| > 1 do                          // unfold all nodes
 3     {u, v} ← arbitrary pair from S
 4     forall the S_j adjacent to S in T* do N_j ← subtree of S in T* with S_j ∈ N_j
 5     G_S = (V_S, E_S, c_S) ← in G contract each N_j to [N_j]     // contraction
 6     (U, V_S \ U) ← min-u-v-cut in G_S, cost λ(u, v), u ∈ U
 7     S_u ← S ∩ U and S_v ← S ∩ (V_S \ U)                  // split S = S_u ∪ S_v
 8     V* ← (V* \ {S}) ∪ {S_u, S_v}, E* ← E* ∪ {{S_u, S_v}}, c*(S_u, S_v) ← λ(u, v)
 9     forall the former edges e_j = {S, S_j} ∈ E* do
10         if [N_j] ∈ U then e_j ← {S_u, S_j}               // reconnect S_j to S_u
11         else e_j ← {S_v, S_j}                            // reconnect S_j to S_v
12 return T*



The intermediate cut tree T* = (V*, E*, c*) is initialized as an isolated, edgeless
node containing all original vertices. Then, until each node of T* is a singleton node,
a node S ∈ V* is split. To this end, nodes S' ≠ S are dealt with by contracting in G
Fig. 11. Situation in Lemma 8. There always exists a cut pair of the edge {S_u, S_j} in the nodes
incident to the edge, independent of the shape of the split cut (dashed). (a) If x ∈ S_u, {x, y} is
still a cut pair of {S_u, S_j}. (b) If x ∉ S_u, {u, y} is a cut pair of {S_u, S_j}.

whole subtrees Nj of S in T*, connected to S via edges {S, Sj}, to single nodes [Nj] 
before cutting, which yields Gs. The split of S into S u and S v is then defined by a 
minimum u-v-cut (split cut) in Gs, which does not cross any of the previously used 
cuts due to the contraction technique. 

Afterwards, each N_j is reconnected, again by S_j, to either S_u or S_v depending on
which side of the cut [N_j] ended up. Note that this cut in G_S can be proven to induce a
minimum u-v-cut in G. The correctness of Cut Tree is guaranteed by Lemma 8, which
takes care of the cut pairs of the reconnected edges. It states that each edge {S, S'} in T*
has a cut pair {x, y} with x ∈ S, y ∈ S'. An intermediate cut tree satisfying this
condition is valid. The assertion is not obvious, since the nodes incident to the edges in
T* change whenever the edges are reconnected. Nevertheless, each edge in the final cut
tree represents a minimum separating cut of its incident vertices, due to Lemma 8. The
lemma was formulated and proven in [4] and rephrased in [5]. See Figure 11.

Fig. 12. Depending on x, Lem. 9 bends the cut (H, V \ H) upwards or downwards.

Lemma 8 (Gus. [5], Lem. 4). Let {S, S_j} be an edge in T* inducing a cut with cut
pair {x, y}, w.l.o.g. x ∈ S. Consider step pair {u, v} ⊆ S that splits S into S_u and S_v,
w.l.o.g. S_j and S_u ending up on the same cut side, i.e. {S_u, S_j} becomes a new edge in
T*. If x ∈ S_u, {x, y} remains a cut pair for {S_u, S_j}. If x ∈ S_v, {u, y} is also a cut
pair of {S_u, S_j}.

While Gomory and Hu use contractions in G to prevent crossings of the cuts, as a sim- 
plification, Gusfield introduced the following lemma showing that contractions are not 
necessary, since any arbitrary minimum separating cut can be bent along the previous 
cuts resolving any potential crossings. See Figure 12. 
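A contraction-free construction in this spirit can be sketched as follows. This is a minimal sketch under stated assumptions, not the formulation given in [5]: it uses the widely used simplified loop that only reassigns vertices with a larger index, a plain Edmonds-Karp maximum flow on a capacity matrix, and hypothetical helper names (`min_cut`, `cut_tree`, `tree_connectivity`). Every flow is computed in the original graph G, and the returned tree is intended to be read via path minima: the lightest edge on the tree path between two vertices gives the value of a minimum cut separating them.

```python
from collections import deque


def capacity_matrix(n, edges):
    """Symmetric capacity matrix of an undirected graph on vertices 0..n-1."""
    cap = [[0] * n for _ in range(n)]
    for a, b, c in edges:
        cap[a][b] += c
        cap[b][a] += c
    return cap


def min_cut(cap, s, t):
    """Edmonds-Karp: return (value, side) of a minimum s-t-cut, where `side`
    holds the vertices reachable from s in the final residual graph."""
    n = len(cap)
    res = [row[:] for row in cap]
    flow = 0
    while True:
        pred = [-1] * n
        pred[s] = s
        queue = deque([s])
        while queue and pred[t] == -1:
            a = queue.popleft()
            for b in range(n):
                if pred[b] == -1 and res[a][b] > 0:
                    pred[b] = a
                    queue.append(b)
        if pred[t] == -1:  # no augmenting path left
            return flow, {v for v in range(n) if pred[v] != -1}
        aug, b = float("inf"), t
        while b != s:  # bottleneck on the augmenting path
            aug = min(aug, res[pred[b]][b])
            b = pred[b]
        b = t
        while b != s:  # push flow along the path
            a = pred[b]
            res[a][b] -= aug
            res[b][a] += aug
            b = a
        flow += aug


def cut_tree(cap):
    """Return tree edges (v, parent, weight) built from n - 1 minimum cuts in G."""
    n = len(cap)
    parent = [0] * n
    weight = [0] * n
    for s in range(1, n):
        t = parent[s]
        weight[s], side = min_cut(cap, s, t)
        for v in range(s + 1, n):  # re-hang later vertices on the side of s
            if v in side and parent[v] == t:
                parent[v] = s
    return [(v, parent[v], weight[v]) for v in range(1, n)]


def tree_connectivity(tree, u, v):
    """Minimum edge weight on the unique tree path between u and v."""
    adj = {}
    for a, b, w in tree:
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    stack = [(u, -1, float("inf"))]
    while stack:
        node, prev, best = stack.pop()
        if node == v:
            return best
        for nxt, w in adj[node]:
            if nxt != prev:
                stack.append((nxt, node, min(best, w)))
    raise ValueError("v is not reachable from u")


# Illustrative graph (an assumption for this sketch, not from the paper).
CAP = capacity_matrix(
    5, [(0, 1, 3), (0, 2, 1), (1, 2, 2), (1, 3, 1), (2, 4, 4), (3, 4, 2)]
)
TREE = cut_tree(CAP)
```

The sketch needs exactly n - 1 maximum flow computations and no contracted graphs; for large inputs the quadratic-space capacity matrix and Edmonds-Karp would be replaced by a faster flow routine.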

Lemma 9 (Gus. [5], Lem. 1). Let (X, V \ X) be a minimum x-y-cut in G, with x ∈ X.
Let (H, V \ H) be a minimum u-v-cut, with u, v ∈ V \ X and x ∈ H. Then the cut
(H ∪ X, (V \ H) ∩ (V \ X)) is also a minimum u-v-cut.
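The statement of Lemma 9 can be checked exhaustively on a small instance. The following is a minimal brute-force sketch, assuming an undirected graph given as a symmetric capacity matrix with integer capacities; the function names and the example graph are hypothetical. It enumerates every minimum x-y-cut (X, V \ X) with x ∈ X and every minimum u-v-cut (H, V \ H) with u, v ∈ V \ X and x ∈ H, and tests whether H ∪ X still separates u and v at the same cost. It is a sanity check on one graph, not a proof.

```python
from itertools import combinations, permutations


def capacity_matrix(n, edges):
    """Symmetric capacity matrix of an undirected graph on vertices 0..n-1."""
    cap = [[0] * n for _ in range(n)]
    for a, b, c in edges:
        cap[a][b] += c
        cap[b][a] += c
    return cap


def cut_value(cap, side):
    """Total capacity of the edges leaving the vertex set `side`."""
    n = len(cap)
    return sum(cap[a][b] for a in side for b in range(n) if b not in side)


def minimum_cuts(cap, a, b):
    """Return (value, sides): all vertex sets separating a and b at minimum cost."""
    n = len(cap)
    sides = []
    for mask in range(1, (1 << n) - 1):  # all proper, non-empty subsets
        side = frozenset(v for v in range(n) if mask >> v & 1)
        if (a in side) != (b in side):
            sides.append(side)
    best = min(cut_value(cap, s) for s in sides)
    return best, [s for s in sides if cut_value(cap, s) == best]


def lemma9_holds(cap):
    """Check the bending statement for all admissible x, y, u, v, X, H."""
    n = len(cap)
    for x, y in permutations(range(n), 2):
        _, x_cuts = minimum_cuts(cap, x, y)
        for X in (c for c in x_cuts if x in c):
            outside = [v for v in range(n) if v not in X]
            for u, v in combinations(outside, 2):
                lam, h_cuts = minimum_cuts(cap, u, v)
                for H in (c for c in h_cuts if x in c):
                    bent = H | X  # the cut side H ∪ X
                    separates = (u in bent) != (v in bent)
                    if not separates or cut_value(cap, bent) != lam:
                        return False
    return True


# Illustrative graph (an assumption for this sketch, not from the paper).
CAP = capacity_matrix(
    5, [(0, 1, 3), (0, 2, 1), (1, 2, 2), (1, 3, 1), (2, 4, 4), (3, 4, 2)]
)
RESULT = lemma9_holds(CAP)
```

The enumeration is exponential in the number of vertices, so it is only meant for graphs with a handful of vertices.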

Proof of Lemma 6 and Lemma 7. In the following proofs, whenever we bend a cut
along another cut deflected by a vertex, we apply Lemma 9.

Lemma 6. Let S denote a smallest SC with respect to s ∈ S and t ∈ V \ S and S' a
smallest SC with respect to s and another vertex x ∈ S. Then S' ⊆ S. If further s ∈ S',
then S is a smallest SC with respect to x and t and SC(t, s) = SC(t, x).

Proof. We distinguish two cases, namely x ∈ S' (Figure 9(a)) and s ∈ S' (Figure 9(b)).
The cases are (almost) symmetric with respect to the first assertion. Hence we prove
the first assertion, which is S' ⊆ S, just for the first one. Nevertheless, we need to
distinguish the second case, since here the edge (t, s) is reconnected and we need to
further show that the SCs remain valid.

Case 1: x ∈ S'. If t ∉ S', it is S' ⊆ S according to Lemma 3(2i).

Now suppose t ∈ S' and consider SC(s, x). Note that t ∉ SC(s, x), since
SC(s, x) ∩ S' = ∅. That is, SC(s, x) ⊆ S due to bending the corresponding cut along S
deflected by t. We show that |SC(s, x)| < |S'| and hence, S' would not be a smallest
SC with respect to s and x, which is a contradiction. Hence, the case t ∈ S' does not
occur.

Let θ denote the cut inducing S and θ' the one inducing S'. We observe that θ'
could be bent along S deflected by t, and thus, θ is a minimum x-t-cut according to
the correctness of the cut tree algorithm (Lemma 8). Hence, θ could be bent along (the
original) θ' deflected by s yielding a cut side T ∋ t of a minimum s-t-cut. Since S is a
smallest SC with respect to s and t, it follows |S| < |T|, while T ⊆ S', SC(s, x) ⊆ S.
Finally it is |SC(s, x)| < |S| < |T| < |S'|.

Case 2: s ∈ S'. Now we know that S' ⊆ S. We show next that in Case 2 S is also
a smallest SC with respect to x and t and SC(t, s) = SC(t, x).

According to the correctness of the cut tree algorithm (Lemma 8), S is also induced
by a minimum x-t-cut, that is, λ(x, t) = λ(s, t) and S = SC(x, t). Hence, it is
further SC(t, s) = SC(t, x), as x ∈ S, and together with S = SC(x, t) (see above), S
is a smallest SC with respect to x and t, since |SC(t, s)| > |S|. □

Lemma 7. Let U denote a smallest SC with respect to u ∈ U and s ∈ V \ U and let
S' denote a smallest SC with respect to s and another vertex x ∈ V \ U. Then U ⊆ S'
or U ∩ S' = ∅.

If s ∈ S' and U ∩ S' = ∅, then U is also a smallest SC with respect to u and x
and if (i) x ∈ SC(s, u), SC(s, u) = SC(x, u) and otherwise (ii) SC(s, u) = S' and
SC(x, u) = SC(x, s) if also u ∉ SC(x, s).

If x ∈ S' and U ⊆ S', then U is also a smallest SC with respect to u and x and if
(i) x ∈ SC(s, u), SC(s, u) = SC(x, u) and otherwise (ii) SC(s, u) = SC(s, x), but no
assertion on SC(x, u), which is the missing opposite SC of the edge (x, u).

Proof. We distinguish two cases, namely s ∈ S' (Figure 10(a), 10(b)) and x ∈ S'
(Figure 10(c), 10(d)). The first assertion, which is U ∩ S' = ∅ or U ⊆ S', follows
in both cases directly from Lemma 3(1), (2i), depending on whether u ∈ S'. In the
following we prove the further assertions.

Case 1: s ∈ S'. If U ∩ S' = ∅, due to the correctness of the cut tree algorithm
(Lemma 8), U induces a minimum u-x-cut, and hence, it is λ(x, u) = λ(s, u) < λ(x, s)
and U = SC(u, x).

(i) x ∈ SC(s, u): Suppose s ∉ SC(x, u). Then it follows λ(x, s) = λ(x, u) and
S' ⊆ SC(s, u) according to Lemma 3(2i). Hence, S' would be a smaller SC with
respect to s and u than SC(s, u), which is a contradiction. Thus, it is s ∈ SC(x, u)
and by Lemma 3(2ii), it is SC(x, u) = SC(s, u) and together with U = SC(u, x) (see
above), U is a smallest SC with respect to u and x, since |SC(s, u)| > |U|.

(ii) x ∉ SC(s, u): It follows λ(x, s) = λ(s, u) and SC(s, u) = S' by Lemma 3(2ii).
If further u ∉ SC(x, s), by Lemma 3(2ii), we see that SC(x, s) = SC(x, u). It is
further |SC(x, u)| > |S'| > |U|, and thus, together with U = SC(u, x) (see above), U
is a smallest SC with respect to u and x.

Case 2: x ∈ S'. If U ⊆ S', due to the correctness of the cut tree algorithm
(Lemma 8), U induces a minimum u-x-cut, and hence, it is λ(x, u) = λ(s, u) < λ(x, s)
and U = SC(u, x).

We claim that s ∈ SC(x, u), which helps to prove (i) and (ii). Suppose s ∉
SC(x, u). Then it is λ(x, s) = λ(x, u) and the cut inducing SC(x, u) can be bent
along S' deflected by s, such that SC(x, u) ⊆ S', and hence, induces a smaller SC with
respect to s and x, which is a contradiction.

(i) x ∈ SC(s, u): Since also s ∈ SC(x, u), it is SC(x, u) = SC(s, u), according to
Lemma 3(2ii). Furthermore, together with U = SC(u, x) (see above), U is a smallest
SC with respect to x and u, since |SC(s, u)| > |U|.

(ii) x ∉ SC(s, u): Hence, λ(s, x) = λ(s, u), and thus, SC(s, u) = SC(s, x). With
s ∈ SC(x, u) and x ∉ SC(s, u), by Lemma 3(2i) it follows SC(s, u) ⊆ SC(x, u).
Hence, |U| < |SC(s, u)| < |SC(x, u)| and together with U = SC(u, x) (see above), U
is a smallest SC with respect to u and x. □